• qwen3 (GGUF, MLX): Distilled version of the DeepSeek-R1-0528 model, created by continuing the post-training process on the Qwen3 8B Base model using Chain-of-Thought (CoT) data from DeepSeek-R1-0528.
• mistral (GGUF, MLX): Devstral by Mistral AI is based on Mistral Small 3.1 and debuts as the #1 open-source model on SWE-bench.
• phi-4 (GGUF, MLX): Lightweight open model from the Phi-4 family.
• phi-4 (GGUF, MLX): Advanced open-weight reasoning model, fine-tuned from Phi-4 with additional reinforcement learning for higher accuracy.
• qwen3moe (GGUF, MLX): The 235B parameter (MoE) version of the Qwen3 model family.
• qwen3 (GGUF, MLX): The 32B parameter version of the Qwen3 model family.
• qwen3moe (GGUF, MLX): The 30B parameter (MoE) version of the Qwen3 model family.
• qwen3 (GGUF, MLX): The 1.7B parameter version of the Qwen3 model family.
• qwen3 (GGUF, MLX): The 4B parameter version of the Qwen3 model family.
• qwen3 (GGUF, MLX): The 14B parameter version of the Qwen3 model family.
• qwen3 (GGUF, MLX): The 8B parameter version of the Qwen3 model family.
• gemma3 (GGUF, MLX): State-of-the-art image + text input models from Google, built from the same research and technology used to create the Gemini models.
• gemma3 (GGUF, MLX): State-of-the-art image + text input models from Google, built from the same research and technology used to create the Gemini models.
• gemma3 (GGUF, MLX): State-of-the-art image + text input models from Google, built from the same research and technology used to create the Gemini models.
• gemma3 (GGUF, MLX): Tiny text-only variant of Gemma 3, Google's latest open-weight model family.
• qwen2 (GGUF): Reasoning model from the Qwen family, rivaling DeepSeek R1 on benchmarks.
• granite (GGUF): A small and capable LLM from IBM.
• qwen2vl (GGUF): A 7B Vision Language Model (VLM) from the Qwen2.5 family.
• phi (GGUF): The latest in the Phi model series, suitable for chats with a context of up to 16K tokens.
• granite (GGUF): Dense LLM from IBM supporting up to 128K context length, trained on 12T tokens. Suitable for general instruction following and can be used to build AI assistants.
• llama (GGUF, MLX): Meta's latest Llama 70B model, matching the performance of Llama 3.1 405B.
• qwen2 (GGUF, MLX): 14B version of the code-specific Qwen 2.5 for code generation, code reasoning, and code fixing.
• qwen2 (GGUF): 32B version of the code-specific Qwen 2.5 for code generation, code reasoning, and code fixing.
• mistral (GGUF): A slightly larger 12B parameter model from Mistral AI, NeMo offers a long 128k-token context length, advanced world knowledge, and function calling for developers.
• mistral (GGUF): A scientific specialist finetune of Mistral AI's popular 7B model, Mathstral excels at STEM chats and tasks.
• gemma2 (GGUF): The mid-sized option of the Gemma 2 model family. Built by Google using the same research and technology used to create the Gemini models.
• gemma2 (GGUF): The large option of the Gemma 2 model family. Built by Google using the same research and technology used to create the Gemini models.
• mistral (GGUF): Mistral AI's latest coding model, Codestral can handle both instructions and code completions with ease in over 80 programming languages.
• mistral: One of the most popular open-source LLMs, Mistral's 7B Instruct model's balance of speed, size, and performance makes it a great general-purpose daily driver.
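The GGUF builds listed above can be run with llama.cpp-compatible tooling. A minimal sketch using llama-cpp-python, assuming a locally downloaded quantized file (the path and filename below are placeholders, not taken from the catalog):

```python
from llama_cpp import Llama

# Placeholder path; point it at any downloaded GGUF quant from the list above.
llm = Llama(
    model_path="models/qwen3-8b-q4_k_m.gguf",
    n_ctx=8192,       # context window to allocate
    n_gpu_layers=-1,  # offload all layers to the GPU when one is available
)

# Chat-style call; the instruct/chat models above ship a chat template in their GGUF metadata.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain the GGUF file format in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```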
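The MLX builds target Apple Silicon via the mlx-lm package. A minimal sketch, assuming an MLX conversion published under the mlx-community namespace (the repo id below is an assumption, not taken from the catalog):

```python
from mlx_lm import load, generate

# Repo id is an assumption; any MLX conversion of the models listed above loads the same way.
model, tokenizer = load("mlx-community/Qwen3-4B-4bit")

prompt = "Write one sentence about on-device language models."
text = generate(model, tokenizer, prompt=prompt, max_tokens=100)
print(text)
```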